Use KvikIO to enable file's fast host read and host write #17764

Open
wants to merge 9 commits into
base: branch-25.04

kingcrimsontianyu
Contributor

@kingcrimsontianyu kingcrimsontianyu commented Jan 17, 2025

Description

This PR makes the following I/O improvements:

  • Remove the legacy cuFile integration to simplify code maintenance. Use KvikIO to manage the GDS setting and compatibility mode.
  • Remove the file utility classes and functions. Use KvikIO for all file-related operations.
  • Replace the in-house implementations of host_read (for file_source) and host_write (for file_sink) with KvikIO's parallel counterparts.
  • Update the documentation on compatibility mode and GDS.

Closes #16418
Issue #17228

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@kingcrimsontianyu kingcrimsontianyu added the libcudf, improvement, and breaking labels Jan 17, 2025
@kingcrimsontianyu kingcrimsontianyu self-assigned this Jan 17, 2025
@kingcrimsontianyu kingcrimsontianyu requested review from a team as code owners January 17, 2025 20:09
@github-actions github-actions bot added the CMake label Jan 17, 2025
@kingcrimsontianyu kingcrimsontianyu marked this pull request as draft January 17, 2025 20:09

copy-pr-bot bot commented Jan 17, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@kingcrimsontianyu
Contributor Author

/ok to test

@kingcrimsontianyu kingcrimsontianyu changed the title from "Remove legacy cuFile integration" to "I/O improvements" Jan 17, 2025
@kingcrimsontianyu kingcrimsontianyu changed the title from "I/O improvements" to "Improvements on data source and data sink" Jan 18, 2025
@kingcrimsontianyu
Contributor Author

/ok to test

@kingcrimsontianyu kingcrimsontianyu force-pushed the remove-legacy-cufile-integration branch from 1d2ed1e to 2f045b4 Compare January 18, 2025 13:35
@kingcrimsontianyu
Contributor Author

/ok to test

@kingcrimsontianyu kingcrimsontianyu force-pushed the remove-legacy-cufile-integration branch from e6847aa to d1932df Compare January 20, 2025 19:30
@kingcrimsontianyu
Contributor Author

/ok to test

@kingcrimsontianyu kingcrimsontianyu changed the title from "Improvements on data source and data sink" to "Use KvikIO to enable file's fast host read and host write" Jan 21, 2025
@kingcrimsontianyu kingcrimsontianyu force-pushed the remove-legacy-cufile-integration branch from d1932df to 183baf8 Compare January 24, 2025 04:06
@github-actions github-actions bot added the Python, Java, cudf.pandas, cudf.polars, and pylibcudf labels Jan 24, 2025
@kingcrimsontianyu kingcrimsontianyu changed the base branch from branch-25.02 to branch-25.04 January 24, 2025 04:07
@kingcrimsontianyu kingcrimsontianyu removed the Python and Java labels Jan 24, 2025
@kingcrimsontianyu kingcrimsontianyu removed the cudf.pandas, cudf.polars, and pylibcudf labels Jan 24, 2025
Resolved review threads:

  • cpp/src/io/utilities/file_io_utilities.hpp (outdated)
  • cpp/src/io/utilities/file_io_utilities.cpp (outdated)
  • cpp/src/io/utilities/data_sink.cpp (2 threads, outdated)
  • cpp/src/io/utilities/datasource.cpp (4 threads outdated, 1 current)
  • docs/cudf/source/user_guide/io/io.md
@kingcrimsontianyu kingcrimsontianyu force-pushed the remove-legacy-cufile-integration branch from f869d9e to 343228a Compare February 4, 2025 20:54
@kingcrimsontianyu kingcrimsontianyu marked this pull request as ready for review February 4, 2025 20:58
@@ -79,7 +79,6 @@ option(CUDA_ENABLE_LINEINFO
 option(CUDA_WARNINGS_AS_ERRORS "Enable -Werror=all-warnings for all CUDA compilation" ON)
 # cudart can be statically linked or dynamically linked. The python ecosystem wants dynamic linking
 option(CUDA_STATIC_RUNTIME "Statically link the CUDA runtime" OFF)
-option(CUDA_STATIC_CUFILE "Statically link cuFile" OFF)
Contributor Author

I think we have downstream applications that depend on this static cuFile library. We should probably keep this option and forward it to KvikIO (and also add static cuFile support there). What do you think? @vuule

Contributor

I have no context on why this option exists. Maybe @robertmaynard or @bdice can help here.

Contributor

I would assume that it's for Spark builds. It was added in #17315 by @KyleFromNVIDIA so he should confirm.

Contributor

Yes, I believe so.

Contributor Author

I think I need some help with this. In both KvikIO's and cuDF's C++ code, we use dlopen to dynamically load the cuFile shared library at runtime. So I'm wondering: (1) why do we still need to link to the cuFile shared library via target_link_libraries at compile-time, (2) how is the cuFile static library useful to us, and (3) is it possible that the compile-time linking is only necessary for our Java code, but not for C++? Thanks for any pointers!

Contributor

We shouldn't need the compile-time linking anymore for the dynamic library cases if we switch over to using KvikIO entirely. I can link you to some Slack threads on that topic. I am less certain about the static library case, in particular what we can assume about cuFile's availability as a dynamic library on the target systems where the Spark plugin is deployed.
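
For readers following the dlopen discussion, the sketch below shows the general runtime-loading pattern in C++: the library is opened by name when the program runs, a symbol is looked up with dlsym, and the caller falls back to another path when either is missing. It is not the cuDF or KvikIO implementation; the class name runtime_library and the helper can_use_library are hypothetical, and the library file name in the usage note is an assumption.

```cpp
#include <dlfcn.h>

#include <string>

// Hypothetical wrapper (not the cuDF or KvikIO implementation): tries to load a
// shared library at runtime and exposes symbol lookup on it.
class runtime_library {
 public:
  explicit runtime_library(std::string const& name)
    : handle_{::dlopen(name.c_str(), RTLD_LAZY | RTLD_LOCAL)}
  {
  }
  ~runtime_library()
  {
    if (handle_ != nullptr) { ::dlclose(handle_); }
  }
  runtime_library(runtime_library const&)            = delete;
  runtime_library& operator=(runtime_library const&) = delete;

  // True if dlopen found and loaded the library on this system.
  bool is_loaded() const { return handle_ != nullptr; }

  // Returns the symbol's address, or nullptr if the library or symbol is missing.
  void* symbol(std::string const& name) const
  {
    return handle_ != nullptr ? ::dlsym(handle_, name.c_str()) : nullptr;
  }

 private:
  void* handle_;
};

// Checks at runtime whether `sym` can be resolved from `lib`. Nothing about
// `lib` is linked at build time, so a missing library is not a link error.
inline bool can_use_library(std::string const& lib, std::string const& sym)
{
  runtime_library const l{lib};
  return l.symbol(sym) != nullptr;
}
```

With this pattern, a call such as `can_use_library("libcufile.so.0", "cuFileDriverOpen")` would return false on a machine without the library instead of failing at link or load time, which is why runtime loading does not by itself require a target_link_libraries entry. On older glibc versions the program may need to be linked with -ldl.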

Contributor

@vuule vuule left a comment

Looks good, aside from the CUDA_STATIC_CUFILE thread.
Thank you for handling this!

// Workaround for https://github.com/rapidsai/cudf/issues/14140, where cuFileDriverOpen errors
// out if no CUDA calls have been made before it. This is a no-op if the CUDA context is already
// initialized.
cudaFree(nullptr);
Contributor

this fits perfectly here :)

Contributor

@shrshi shrshi left a comment

Looks good to me!

Labels
breaking, CMake, improvement, libcudf
Projects
Status: In Progress
Status: No status
Development

Successfully merging this pull request may close these issues.

Remove the cuFile (GDS) backend
5 participants